9 research outputs found

    Exploiting concurrency among tasks in partitionable parallel processing systems

    Includes bibliographical references. One benefit of partitionable parallel processing systems is their ability to execute multiple independent tasks simultaneously. Previous work has identified conditions under which, when there are k tasks to be processed, partitioning the system so that all k tasks are processed simultaneously results in a minimum overall execution time. An alternate condition is developed that provides additional insight into the effects of parallelism on execution time. This result, and previous results, however, assume that execution times are data independent. It is shown that data-dependent tasks do not necessarily execute faster when processed simultaneously even if the condition is met. A model is developed that accounts for the possible variability of a task's execution time and is used in a new framework to study the problem of finding an optimal mapping of identical, independent, data-dependent execution time tasks onto partitionable systems. Extension of this framework to situations where the k tasks are non-identical is discussed. This work was supported by the Naval Ocean Systems Center under the High Performance Computing Block, ONT, and by the Office of Naval Research under grant number N00014-90-J-1937.
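    The central observation above, that simultaneous execution can lose when task times are data dependent, can be illustrated with a small numeric sketch. The cost model (linear speedup plus a logarithmic per-task overhead) and all machine sizes and task times below are illustrative assumptions, not taken from the paper:

```python
import math

# Toy model (an assumption, not the paper's): a task with serial time t
# run on p processors takes t/p + log2(p), the log term standing in for
# per-task communication/partitioning overhead.
def run_time(t, p):
    return t / p + math.log2(p)

n, k = 64, 8  # hypothetical machine size and task count

def sequential(times):
    # Run the k tasks one after another, each on all n processors.
    return sum(run_time(t, n) for t in times)

def simultaneous(times):
    # Run all k tasks at once, each on a partition of n/k processors;
    # the makespan is set by the slowest task.
    return max(run_time(t, n // k) for t in times)

equal_times = [100] * k                # data-independent case
skewed_times = [10] * (k - 1) + [730]  # data-dependent case, same total work

# With equal times, simultaneous wins (overhead is paid once per task,
# not k times); with one long straggler, the max term makes it lose.
print(simultaneous(equal_times), sequential(equal_times))    # 15.5 vs 60.5
print(simultaneous(skewed_times), sequential(skewed_times))  # 94.25 vs 60.5
```

Both workloads have the same total serial work (800 time units); only the variability differs, which is exactly the effect the abstract's model captures.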

    Aspects of reconfigurable parallel processing systems: Architecture, interconnection, and task allocation

    Approaches for providing communications among the processors and memories of large-scale parallel processing systems are often based on the multistage cube and data manipulator topologies. One goal of this research is to provide system designers with the tools and methods to use in deciding which of these two topologies performs best on the basis of a set of performance and cost criteria for a given set of implementation parameters. A technique for studying buffered multistage networks is described that is more efficient than simulation techniques and more tractable than analytical methods. The technique is applied to the dilated multistage cube and data manipulator topologies. One interesting characteristic of the data manipulator topology is the existence of multiple disjoint paths through the network for some source and destination combinations. Properties of disjoint paths are derived and used to point out advantages and limits of this characteristic. The organization of the PASM parallel processing system is overviewed and an efficient masking technique for large-scale microprocessor-based SIMD architectures is presented. SIMD architectures require mechanisms that efficiently enable and disable (mask) processors to support flexible programming. Most current SIMD architectures incorporate masking logic that allows processors to disable themselves based on data-conditional results calculated at the processor's level (local masking). Global processor masks, specified by the control unit, are more efficient for tasks where the masking is data independent. An efficient hybrid masking technique is proposed that supports global as well as local masking, and a design for the proposed hybrid mechanism is described. One benefit of partitionable systems is their ability to execute independent tasks simultaneously. Previous work has identified conditions such that, when there are k tasks to be processed, partitioning the system such that all k tasks are processed simultaneously minimizes overall execution time. This result, however, assumes that execution times are data independent. It is shown that data-dependent tasks do not necessarily execute faster when processed simultaneously even if the condition is met. A model is developed that provides for the possible variability of a task's execution time and is used in a framework to study the problem of finding an optimal mapping for identical, independent, data-dependent execution time tasks onto partitionable systems.

    A methodology for exploiting concurrency among independent tasks in partitionable parallel processing systems

    Includes bibliographical references (pages 277-278). One benefit of partitionable parallel processing systems is their ability to execute multiple, independent tasks simultaneously. Previous work has identified conditions such that, when there are k tasks to be processed, partitioning the system so that all k tasks are processed simultaneously results in a minimum overall execution time. An alternate condition is developed that provides additional insight into the effects of parallelism on execution time. This result and previous results, however, assume that execution times are data independent. It is shown that data-dependent tasks do not necessarily execute faster when processed simultaneously even if the condition is met. A model is developed that provides for the possible variability of a task's execution time and is used in a new framework to study the problem of finding an optimal mapping for identical, independent, data-dependent execution time tasks onto partitionable systems. Executing one, some, or all of the k tasks simultaneously is considered. Because this new framework is general, it can also serve as a new method for the study of data-independent tasks. Extension of this framework to situations where the k tasks are nonidentical is discussed.
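    The "one, some, or all of the k tasks simultaneously" comparison can be sketched under a toy cost model. The overhead term, task times, and machine size below are illustrative assumptions, not values from the thesis:

```python
import math

n, k = 64, 8                # hypothetical machine size and task count
times = [10] * 7 + [730]    # identical tasks whose data-dependent times differ

def makespan(m):
    # Process the k tasks in batches of m: each batch runs its m tasks
    # simultaneously on partitions of n/m processors, and finishes when
    # its slowest task does (toy model: t*m/n + log2(n/m) overhead).
    total = 0.0
    for i in range(0, k, m):
        batch = times[i:i + m]
        total += max(batch) * m / n + math.log2(n // m)
    return total

# Compare running 1 task at a time (full machine), all 8 at once,
# and the intermediate batch sizes.
best_m = min((1, 2, 4, 8), key=makespan)
print({m: makespan(m) for m in (1, 2, 4, 8)}, best_m)
```

With these numbers the intermediate choice m = 2 beats both extremes (43.75 versus 60.5 for one-at-a-time and 94.25 for all-at-once), illustrating why the framework considers "some" and not just "one" or "all".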

    Computing multiple quadratic forms for a minimum variance distortionless response adaptive beamformer using parallelism: Analyses and experiments

    Data-parallel implementations of the computationally intensive task of solving multiple quadratic forms (MQFs) have been examined. Coupled and uncoupled parallel methods are investigated, where coupling relates to the degree of interaction among the processors. Also, the impact of partitioning a large MQF problem into smaller non-interacting subtasks is studied. Trade-offs among the implementations for various data-size/machine-size ratios are categorized in terms of complex arithmetic operation counts, communication overhead, and memory storage requirements. Furthermore, the impact on performance of the mode of parallelism used is considered, specifically, SIMD versus MIMD versus SIMD/MIMD mixed-mode. From the complexity analyses, it is shown that none of the algorithms presented in this paper is best for all data-size/machine-size ratios. Thus, to achieve scalability (i.e., good performance as the number of processors available in a machine increases), instead of using a single algorithm, the approach proposed is to have a set of algorithms from which the most appropriate algorithm or combination of algorithms is selected based on the ratio calculated from the scaled machine size. The analytical results have been verified through experiments on the MasPar MP-1 (SIMD), nCUBE 2 (MIMD), and PASM (mixed-mode) prototypes.
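    The kernel underlying the MQF problem is the quadratic form w^H R w. A minimal pure-Python sketch follows; the matrix and weight vectors are made up, and the "uncoupled" mapping shown simply loops where a real implementation would assign one form per processor with no interprocessor communication:

```python
def quadratic_form(R, w):
    # Compute w^H R w for a complex n x n matrix R and length-n vector w.
    n = len(w)
    Rw = [sum(R[i][j] * w[j] for j in range(n)) for i in range(n)]
    return sum(w[i].conjugate() * Rw[i] for i in range(n))

# Uncoupled mapping: each weight vector's form is an independent subtask,
# so each could be handed to its own processor (hypothetical data below).
R = [[2 + 0j, 0j],
     [0j, 3 + 0j]]                    # stand-in for a covariance matrix
ws = [[1 + 0j, 0j], [1 + 0j, 1j]]     # stand-in weight vectors
forms = [quadratic_form(R, w) for w in ws]
print(forms)  # prints [(2+0j), (5+0j)]
```

The coupled methods the abstract mentions would instead spread a single w^H R w across processors (e.g., partitioning R by rows), trading communication overhead for a shorter critical path, which is why the best choice depends on the data-size/machine-size ratio.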

    Phylum XIV. Bacteroidetes phyl. nov.


    Quellen- und Literaturverzeichnis (List of sources and references)


    Annual Selected Bibliography
